International Neural Networks Society (INNS), 2014. Leveraged by this grant, the PI has been granted an ARO Young Investigator Program
(YIP)  Award  and  a  Defense  University  Research  Instrumentation  Program  (DURIP)  award. 


Enter  List  of  papers  submitted  or  published  that  acknowledge  ARO  support  from  the  start  of 
the  project  to  the  date  of  this  printing.  List  the  papers,  including  journal  references,  in  the 
following  categories: 

(a)  Papers  published  in  peer-reviewed  journals  (N/A  for  none) 


Received  Paper


06/04/2014  2.00  Kang Li, Yun Fu. Prediction of Human Activity by Discovering Temporal Sequence Patterns,
IEEE Transactions on Pattern Analysis and Machine Intelligence, (01 2014): 0. doi: 10.1109/TPAMI.2013.2297321

06/04/2014  3.00  Ming Shao, Dmitry Kit, Yun Fu. Generalized Transfer Subspace Learning Through Low-Rank Constraint,
International Journal of Computer Vision, (08 2014): 0. doi: 10.1007/s11263-014-0696-6

06/04/2014  4.00  Liangyue Li, Sheng Li, Yun Fu. Learning low-rank and discriminative dictionary for image classification,
Image and Vision Computing, (03 2014): 0. doi: 10.1016/j.imavis.2014.02.007

06/04/2014  10.00  Ya Su, Sheng Li, Shengjin Wang, Yun Fu. Submanifold Decomposition,
IEEE Transactions on Circuits and Systems for Video Technology, (06 2014): 0. doi:

06/04/2014  11.00  Yuan Yao, Yun Fu. Contour Model Based Hand-Gesture Recognition Using Kinect Sensor,
Circuits and Systems for Video Technology, IEEE Transactions on, (01 2014): 0. doi:

TOTAL:  5


Number of Papers published in peer-reviewed journals:

(b)  Papers  published  in  non-peer-reviewed  journals  (N/A  for  none) 


Received  Paper 


TOTAL: 


Number of Papers published in non peer-reviewed journals:


(c)  Presentations 


Number of Presentations:  0.00


Non  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


Received  Paper 


TOTAL: 


Number  of  Non  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 

Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


Received  Paper 


06/04/2014  5.00  Sheng  Li,  Yun  Fu.  Robust  Subspace  Discovery  through  Supervised  Low-Rank  Constraints, 

2014  SIAM  International  Conference  on  Data  Mining.  26-APR-14,  .  Philadelphia,  PA:  Society  for  Industrial 
and  Applied  Mathematics,  Society  for  Industrial  and  Applied  Mathematics 

06/04/2014  6.00  Sheng  Li,  Peng  Li,  Yun  Fu.  Understanding  3D  human  torso  shape  via  manifold  clustering, 

SPIE  Defense,  Security,  and  Sensing.  29-APR-13,  Baltimore,  Maryland,  USA.  :  , 

06/04/2014  7.00  Xu  Zhao,  Yuncai  Liu,  Yun  Fu.  Exploring  discriminative  pose  sub-patterns  for  effective  action  classification, 
21st  ACM  international  conference  on  Multimedia.  21-OCT-13,  Barcelona,  Spain.  :  , 

06/04/2014  8.00  Ming  Shao,  Liangyue  Li,  Yun  Fu.  What  Do  You  Do?  Occupation  Recognition  in  a  Photo  via  Social 
Context, 

2013  IEEE  International  Conference  on  Computer  Vision  (ICCV).  01-DEC-13,  Sydney,  Australia.  :  , 

06/04/2014  9.00  Yizhe  Zhang,  Ming  Shao,  Edward  K.  Wong,  Yun  Fu.  Random  Faces  Guided  Sparse  Many-to-One 
Encoder  for  Pose-Invariant  Face  Recognition, 

2013  IEEE  International  Conference  on  Computer  Vision  (ICCV).  01-DEC-13,  Sydney,  Australia.  :  , 

06/04/2014  12.00  Sheng  Li,  Ming  Shao,  Yun  Fu.  Locality  Linear  Fitting  One-class  SVM  with  Low-Rank  Constraints  for 
Outlier  Detection, 

International  Joint  Conference  on  Neural  Networks  (IJCNN).  06-JUL-14,  .  :  , 

06/04/2014  13.00  Shuyang  Wang,  Jinzheng  Sha,  Huaiyu  Wu,  Yun  Fu.  Hierarchical  Facial  Expression  Animation  by  Motion 
Capture  Data, 

IEEE  International  Conference  on  Multimedia  and  Expo  .  14-JUL-14,  .  :  , 

06/04/2014  14.00  Chengcheng  Jia,  Guoqiang  Zhong,  Yun  Fu.  Low-Rank  Tensor  Learning  with  Discriminant  Analysis 
for Action Classification and Image Recovery,

Twenty-Eighth  AAAI  Conference  on  Artificial  Intelligence.  28-JUL-14,  .  :  , 

TOTAL:  8 


Number  of  Peer-Reviewed  Conference  Proceeding  publications  (other  than  abstracts): 


(d)  Manuscripts 


Received  Paper 


TOTAL: 


Number  of  Manuscripts: 


Books 


Received  Book 


06/03/2014  1.00  Yun  Fu,  Yunqian  Ma.  Graph  Embedding  for  Pattern  Analysis,  New  York:  Springer,  (12  2013) 
TOTAL:  1 


Received  Book  Chapter 


TOTAL: 


Patents  Submitted 


Patents  Awarded 


Awards 

-ONR  Young  Investigator  Award 
Office  of  Naval  Research  (ONR),  2014 


-ARO  Young  Investigator  Award 
Army  Research  Office  (ARO),  2014 

-INNS  Young  Investigator  Award 

International  Neural  Networks  Society  (INNS),  2014 

-SDM  Best  Paper  Award 

SIAM  International  Conference  on  Data  Mining  (SDM)  Best  Paper  Award,  2014 


Graduate  Students 


NAME 

Chengcheng  Jia 
Shuyang  Wang 

FTE  Equivalent: 
Total  Number: 


Names  of  Post  Doctorates 

NAME  PERCENT  SUPPORTED 

FTE  Equivalent: 

Total  Number: 


Names  of  Faculty  Supported 

NAME  PERCENT  SUPPORTED  National  Academy  Member 

0.10 

0.10 

1 


Names  of  Under  Graduate  students  supported 


Student  Metrics 

This  section  only  applies  to  graduating  undergraduates  supported  by  this  agreement  in  this  reporting  period 

The  number  of  undergraduates  funded  by  this  agreement  who  graduated  during  this  period: .  0.00 

The  number  of  undergraduates  funded  by  this  agreement  who  graduated  during  this  period  with  a  degree  in 

science,  mathematics,  engineering,  or  technology  fields: . 0.00 

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  will  continue 

to  pursue  a  graduate  or  Ph.D.  degree  in  science,  mathematics,  engineering,  or  technology  fields; . 0.00 

Number  of  graduating  undergraduates  who  achieved  a  3.5  GPA  to  4.0  (4.0  max  scale): . 0.00 

Number  of  graduating  undergraduates  funded  by  a  DoD  funded  Center  of  Excellence  grant  for 

Education, Research and Engineering: . 0.00

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  intend  to  work 

for  the  Department  of  Defense . 0.00 

The  number  of  undergraduates  funded  by  your  agreement  who  graduated  during  this  period  and  will  receive 

scholarships  or  fellowships  for  further  studies  in  science,  mathematics,  engineering  or  technology  fields: . 0.00 

Names  of  Personnel  receiving  masters  degrees 

NAME 


Yun  Fu 

FTE  Equivalent: 
Total  Number: 


PERCENT  SUPPORTED  Discipline 
1.00 
1.00 

2.00 


Total  Number: 


Names  of  personnel  receiving  PHDs 

NAME 

Total  Number: 

Names  of  other  research  staff 

NAME  PERCENT  SUPPORTED 

FTE  Equivalent: 

Total  Number: 

Sub  Contractors  (DD882) 

Inventions  (DD882) 


See  Attachment 


Scientific  Progress 
Technology  Transfer 


DEPARTMENT  OF  THE  ARMY 

UNITED  STATES  ARMY  RESEARCH  LABORATORY 
ARMY  RESEARCH  OFFICE 
P.  O.  BOX  12211 

RESEARCH  TRIANGLE  PARK  NC  27709-2211 

Final  Report:  Scientific  Progress  and  Accomplishments 
1. Statement of the Problem Studied

Periodically, the US Army conducts detailed measurement surveys of its soldiers to understand how
changes in soldier body size affect the design, fit and sizing of virtually every piece of clothing
and equipment that soldiers wear and use in combat. The recently finished US Army Anthropometric
Survey (ANSUR II) has collected 3D body scan data of soldiers at the Natick Soldier Center (NSC),
as shown in Figure 1. Applying new techniques for shape analysis and classification to these 3D
body scan data will help designers of clothing and personal protection equipment to understand and
fit the Army population. The overall research goal of this proposal is to create a new manifold
learning framework for large-scale graph decomposition and approximation problems by low-rank
approximation, and to guarantee computable, stable and fast optimizations for 3D shape description
and classification.

Traditional body shape description is based on a few anthropometric measurements, mostly chest,
waist and hip circumferences. These simple measurements cannot completely capture the
three-dimensional shape variation of the human body. Recent progress in 3D scanning technology has
made it possible to capture the 3D shape of the human body. Although research has been conducted on
shape description and retrieval of the human body, there is no report of successful 3D shape
classification. Classification is believed to be important because it will benefit the design of
garments, sportswear, personal protection clothing and equipment, office and health care devices,
etc. It is therefore desirable to develop an effective shape descriptor and a robust shape
classification method for 3D human body surface data. Moreover, large-scale (big) 3D shape data,
such as ANSUR II, may cause intractable computational complexity, especially for graph
decomposition based machine learning methods, e.g. manifold learning, graph embedding, subspace
learning, and clustering.

2.  Scientific  Accomplishments 

The PI’s group has published (or had accepted for publication) 1 book through Springer and 13
scientific papers partially supported by this grant. In particular, these papers are in top journals
and conference proceedings such as TPAMI, IJCV, TCSVT, ICCV, AAAI, SDM, ACM MM, etc. One paper,
1 out of 384, received the Best Paper Award at SDM 2014. The PI, Dr. Y. Raymond Fu, has received the
2014 INNS Young Investigator Award from the International Neural Networks Society (INNS).
Leveraged by this grant, the PI has been granted an ARO Young Investigator Program (YIP) Award
titled “Intention Sensing through Video-based Imminent Activity Prediction”, and a Defense
University Research Instrumentation Program (DURIP) award titled “3D Data Acquisition
Platform for Human Activity Understanding”.


Figure  1: 3D  Human  Scan. 
3. Summary of Results

Discovering the variations in human torso shape plays a key role in many design-oriented
applications, such as suit design. With recent advances in 3D surface imaging technologies, people
can obtain 3D human torso data that provide more information than traditional measurements.
However, how to find different human shapes from 3D torso data is still an open problem. From this
STIR project, we have created a new algorithmic tool set for modeling large-scale high-dimensional
data [1-14]. For visual representation under uncertainty, we proposed a class of manifold and
subspace learning methods [1] including submanifold decomposition [4], manifold clustering [11],
deep learning [14], one-class classification [10], low-rank and discriminative dictionary learning
[6], robust low-rank subspace discovery [7], low-rank tensor completion [8] and low-rank transfer
subspace learning [3]. We also proposed applications of these techniques to analyzing
spatial-temporal patterns of human motion, action, and activity [2], 3D hand-gesture recognition
[5], expression animation by motion capture [9], 3D human torso shape understanding [11],
discriminative pose sub-patterns [12], and human gesture in social context [13]. This report
highlights the details of [4, 7, 8, 11].

3.1  Manifold  Modeling 
Figure 2: Toy example for submanifold decomposition. (a) The original data on the top are fused from
two uncorrelated manifolds, blue and red respectively. Any point (denoted by a star) is affected by
and belongs to both of the two manifolds. (b) Traditional manifold learning algorithms can only
extract one manifold based on the nearest-neighborhood relationship. (c) Multiple manifold learning
algorithms learn two manifolds that are separated in distance, and assume any point can be subject
to only one manifold. (d) SMD learns two manifolds simultaneously.


Low-dimensional structures embedded in a high-dimensional data space can be extracted by spectral
analysis and manifold learning. Standard approaches for manifold learning are usually based on the
assumption that there is a dominant low-dimensional manifold, while other variations are given
minor priority. We instead consider the scenario in which a pair of distinct manifolds are
intertwined in the same high-dimensional space and can be decomposed for analysis. The novel
submanifold decomposition (SMD) algorithm, shown in Figure 2, has three contributions: 1) a
submanifold framework is proposed to model a high-dimensional dataset that is dominated by more
than one factor; 2) a nonlinear manifold decomposition method is presented to extract two
intertwined manifolds from a dataset in a discriminative manner; and 3) in order to solve the
“Out-of-Sample” problem of nonlinear SMD, a linear extension of SMD is developed that effectively
extracts two linear submanifolds. We demonstrated that, compared with existing manifold learning
methods that only extract one dominant manifold, the proposed SMD and its linear extension are
capable of extracting a pair of submanifolds discriminatively and effectively. Moreover, the two
extracted manifolds can complement each other to enhance the representation performance.

The articulated configuration of human body parts is an essential representation of human shape and
motion, and is therefore well suited for classification. We proposed a novel approach to exploring
discriminative pose sub-patterns based on the submanifold decomposition assumption. These pose
sub-patterns are extracted from a predefined set of 3D poses represented by hierarchical motion
angles. The basic idea is motivated by two observations: (1) There exist representative
sub-patterns in each action class, from which the action class can be easily differentiated. (2)
These sub-patterns frequently appear in the action class. By constructing a connection between
frequent sub-patterns and the discriminative measure, we developed the Support Sub-Pattern Induced
learning algorithm for simultaneous feature selection and feature learning. The new model
generalizes well enough to extend the application to Army video data.

3.2  Low-Rank  Manifold  Modeling 


Subspace learning can facilitate manifold modeling for feature extraction and classification.
However, its performance is heavily degraded when data are corrupted by large amounts of noise.
Inspired by recent work in matrix recovery, we tackle this problem by exploiting a subspace that is
robust to noise and large variability for manifold modeling. Specifically, we propose a novel
Supervised Regularization based Robust Subspace (SRRS) approach via low-rank learning, shown in
Figure 3. Unlike existing subspace methods, our approach jointly learns low-rank representations
and a robust subspace from noisy observations. At the same time, to improve the classification
performance, class label information is incorporated as supervised regularization. The problem can
then be formulated as a constrained rank minimization objective function, which can be effectively
solved by the inexact augmented Lagrange multiplier (ALM) algorithm. Our approach differs from
current sparse representation and low-rank learning methods in that it explicitly learns a
low-dimensional subspace in which the supervised information is incorporated.
Figure 3: Robust low-rank subspace recovery. We jointly remove noise from data X and learn a robust
subspace P. The corrupted samples are mixed in the original space, but they are well separated in
the learned subspace.
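
The SRRS objective itself is not written out in this report, so the following is only a sketch of
the inexact ALM machinery on the standard robust PCA problem (minimize ||A||_* + lambda ||E||_1
subject to X = A + E), not the SRRS model. The function names, the default lambda = 1/sqrt(max(m, n)),
the initial penalty 1.25/||X||_2 and the growth factor 1.5 are assumptions taken from common practice.

```python
import numpy as np

def soft_threshold(M, tau):
    """Entrywise shrinkage: proximal operator of tau * (L1 norm)."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0.0)

def svd_threshold(M, tau):
    """Singular value shrinkage: proximal operator of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def rpca_inexact_alm(X, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split X into a low-rank part A and a sparse part E (illustrative only)."""
    X = np.asarray(X, dtype=float)
    m, n = X.shape
    lam = 1.0 / np.sqrt(max(m, n)) if lam is None else lam
    norm_X = np.linalg.norm(X, "fro")
    mu = 1.25 / np.linalg.norm(X, 2)      # initial penalty (assumed heuristic)
    A = np.zeros_like(X)
    E = np.zeros_like(X)
    Y = np.zeros_like(X)                  # Lagrange multiplier
    for _ in range(max_iter):
        A = svd_threshold(X - E + Y / mu, 1.0 / mu)    # low-rank update
        E = soft_threshold(X - A + Y / mu, lam / mu)   # sparse update
        R = X - A - E                                  # constraint residual
        Y = Y + mu * R                                 # multiplier update
        mu = min(mu * rho, 1e7)                        # increase the penalty
        if np.linalg.norm(R, "fro") / norm_X < tol:
            break
    return A, E
```

Each iteration performs one singular value thresholding step and one entrywise shrinkage step
rather than solving the inner problem exactly, which is what makes the scheme "inexact".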

3.3  Low-Rank  Tensor  Manifold  Modeling 

Tensor completion can further facilitate low-rank manifold modeling for extraction of the intrinsic
structure of multimode data. We present a supervised low-rank tensor completion method for
dimensionality reduction, to learn an optimal subspace for manifold modeling. Our model
automatically learns the low dimensionality of the tensor, as opposed to requiring it to be manually
pre-defined as in other dimensionality reduction methods. Considering the underlying structure
information of the whole high-dimensional dataset, it uses low-rank learning to extract the
structure for image recovery, while integrating the discriminant analysis criterion. Figure 4 shows
the framework of our method applied to video-based action classification. We first select a
training set from an action video database to learn the low-rank projection matrices, which are then
used to calculate a tensor subspace for action classification. When calculating the low-rank
projection matrices, we adopt a discriminant analysis criterion as a regularizer to avoid
over-fitting. Meanwhile, with this discriminant analysis criterion, supervisory information is
seamlessly integrated into the low-rank tensor completion model. After projecting the original
training and testing sets to the learned tensor subspace, we predict the labels of the test video
sequences with a K-nearest neighbor (KNN) classifier. We also use the sample information to recover
some face images by removing illumination variations. The contributions of this work are 1) a new
discriminative method for low-rank tensor completion, which automatically learns the low
dimensionality of the tensor subspace for feature extraction; 2) integration of the discriminant
analysis criterion in the low-rank tensor completion model based on the given supervisory
information; 3) extraction of the underlying structure of the original tensor data by low-rank
learning, which reconstructs the data from the learned tensor subspace, for high-dimensional image
recovery.
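
The projection and classification steps described above can be sketched as follows. The sketch
assumes the per-mode projection matrices have already been learned (the low-rank discriminant
learning itself is not reproduced here), and the names `mode_n_product`, `project` and
`knn_predict` are hypothetical helpers, not the authors' code.

```python
import numpy as np

def mode_n_product(T, U, mode):
    """Multiply tensor T by matrix U (shape r x T.shape[mode]) along axis `mode`."""
    moved = np.moveaxis(T, mode, 0)                  # bring the chosen mode to the front
    out = np.tensordot(U, moved, axes=([1], [0]))    # contract it against U's columns
    return np.moveaxis(out, 0, mode)                 # put the new axis back in place

def project(T, mats):
    """Apply one projection matrix per mode, giving a smaller core tensor."""
    for mode, U in enumerate(mats):
        T = mode_n_product(T, U, mode)
    return T

def knn_predict(train_feats, train_labels, test_feats, k=1):
    """Majority vote among the k nearest training rows (squared Euclidean distance).

    train_labels must be a NumPy array of non-negative integers.
    """
    d = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_labels[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])
```

In use, each video tensor would be passed through `project`, flattened with `.ravel()`, and the
resulting rows fed to `knn_predict`.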


Figure 4: Low-rank tensor manifold modeling. The tensor training set X is used for calculating the
low-rank projection matrices (figure label: “Projection matrices by low-rank discriminant
learning”), which are employed for subspace alignment of the training and testing sets Y and Y’.


3.4  Dataset 


The Civilian American and European Surface Anthropometry Resource (CAESAR) database, Figure 5,
contains 3D scans, seventy-three anthropometry landmarks, and traditional measurement data of 5,000
subjects. In our experiments, we select the torso data of 1,100 subjects (about 650 male and 450
female) from the CAESAR database. For simplicity, we only choose the standing pose of every
subject, and convert the scanned data of each subject into a column vector before evaluating
different clustering algorithms.


Figure 5: Image Source: http://grail.cs.washington.edu/projects/digital-human/pub/allen03space.pdf


The MSR hand gesture 3D database (Oreifej, Liu, and Redmond 2013; Wang et al. 2012) contains 12
classes of hand gestures: the letters “Z” and “J”, and “Where”, “Store”, “Pig”, “Past”, “Hungry”,
“Green”, “Finish”, “Blue”, “Bathroom”, and “Milk”. These are performed by 10 subjects, each
performing 2-3 times. There are 333 samples in total, each an action video consisting of a depth
image sequence. We use the same experimental set-up as (Oreifej, Liu, and Redmond 2013; Wang et al.
2012) in this experiment. All the subjects are independent, and each video sequence is subsampled
to a size of 80x80x18. The image dimension is sufficient to represent the gesture, and the third
dimension is set by the smallest number of frames among the video sequences.
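
The report does not say how the subsampling to 80x80x18 was carried out. One simple possibility,
shown here purely as an assumption, is nearest-neighbour index selection along each axis of an
(height, width, frames) depth video; `subsample_video` is a hypothetical name.

```python
import numpy as np

def subsample_video(video, out_shape=(80, 80, 18)):
    """Pick evenly spaced indices along every axis of an (H, W, T) array."""
    idx = [np.round(np.linspace(0, n - 1, m)).astype(int)
           for n, m in zip(video.shape, out_shape)]
    # np.ix_ builds an open mesh so the three index lists select a full sub-block.
    return video[np.ix_(*idx)]
```

If an axis is shorter than the requested size, indices repeat, so the function pads by
duplication rather than failing.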


The MSR action 3D database contains 20 classes of actions. These include “arm waving”, “horizontal
waving”, “hammer”, “hand catching”, “punching”, “throwing”, “drawing x”, “drawing circle”,
“clapping”, “two hands waving”, “side boxing”, “bending”, “forward kicking”, “side kicking”,
“jogging”, “tennis swing”, “golf swing”, and “picking up and throwing”. Each action is performed by
10 subjects, each performing 2-3 times. There are 567 samples in total. The action video is
represented as a high-dimensional tensor in this experiment. In the following, we report two sets
of results obtained under different experimental settings.


3.5  Evaluation  Results 

In the evaluation, we use a spectral clustering approach on the torso manifold to analyze the shape
variations in 3D human torso data. In particular, the high-dimensional torso data are first
represented in a low-dimensional space that is learnt by manifold embedding algorithms such as
locally linear embedding (LLE) and Laplacian eigenmaps (LE). Then we perform spectral clustering on
the manifold to group the torso data points into several disjoint subsets. We evaluate the
performance of our algorithm on the CAESAR 3D human torso database. Experimental results show that
our approach achieves better performance than the compared clustering method, and that the cluster
centers discovered by our approach can describe the discrepancies in both genders and human shapes.
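
As a rough illustration of this pipeline, the sketch below builds a k-nearest-neighbour graph,
takes the bottom eigenvectors of its normalized graph Laplacian (the same graph Laplacian that
Laplacian eigenmaps is built on), and runs K-means on the row-normalized embedding. This is a
generic Ng-Jordan-Weiss style spectral clustering, not necessarily the exact variant used in the
experiments; the neighbourhood size and the farthest-point K-means initialization are assumptions.

```python
import numpy as np

def knn_affinity(X, n_neighbors=10):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix for the rows of X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq, np.inf)                       # exclude self-matches
    nn = np.argsort(sq, axis=1)[:, :n_neighbors]
    W = np.zeros(sq.shape)
    W[np.arange(len(X))[:, None], nn] = 1.0
    return np.maximum(W, W.T)                          # symmetrize

def spectral_embedding(W, n_components):
    """Row-normalized bottom eigenvectors of the normalized graph Laplacian."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)                    # eigenvalues in ascending order
    U = vecs[:, :n_components]
    return U / np.linalg.norm(U, axis=1, keepdims=True)

def kmeans(Z, k, n_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def spectral_clustering(X, n_clusters, n_neighbors=10):
    """Cluster the rows of X; returns (labels, centers in the embedding space)."""
    W = knn_affinity(X, n_neighbors)
    return kmeans(spectral_embedding(W, n_clusters), n_clusters)
```

Here each row of `X` would be one subject's vectorized torso scan, and `n_clusters` the assumed
number of shape types.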


Figure 6: Visualization of 3-cluster manifold clustering results on male (right) and female (left)
torsos, including the reconstructed surfaces of the 3 cluster centers.


We test our algorithm on both male and female torsos. The traditional K-Means algorithm on the PCA
embedding space is chosen as our baseline. We observed that the data distribution discovered by PCA
is not smooth, and some data points can easily be regarded as outliers, which often has a negative
effect on clustering. As a result, the boundaries of different clusters are unclear, and there is
even an obvious overlap between clusters, which makes it difficult to analyze torso shape
variations based on the clustering results.
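
For reference, the embedding step of this baseline can be sketched with a plain SVD-based PCA.
The function name `pca_embed` is hypothetical; K-means would then be run on the returned
coordinates.

```python
import numpy as np

def pca_embed(X, n_components):
    """Project the rows of X onto the top principal directions.

    Returns the embedded coordinates and the explained-variance ratio of each
    retained component.
    """
    Xc = X - X.mean(axis=0)                            # center every feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # rows of Vt are principal axes
    Z = Xc @ Vt[:n_components].T
    ratio = (s ** 2)[:n_components] / (s ** 2).sum()
    return Z, ratio
```

Because PCA is a linear projection, it cannot unfold a curved manifold, which is consistent with
the non-smooth distribution reported above.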

Figure 6 shows the spectral clustering results after LE embedding. It demonstrates that: (1)
manifold learning algorithms, such as LLE and LE, can reveal the underlying structure of 3D female
torsos. We can observe that the manifold is smooth, and there are few outliers. (2) Our clustering
algorithm successfully groups the data into several clusters, and there is almost no overlap
between different clusters. We can thus obtain several disjoint clusters using our proposed
algorithm. How can the shape variations be analyzed according to the clustering results? A rational
strategy is to use the center of each cluster as its prototype. To visualize different cluster
centers, we reconstruct the torso surfaces according to the original 3D torso data, and illustrate
3 cluster centers in Figure 6. We also show the average height of each cluster. Interestingly, the
3 cluster centers display significant differences in their bust girth and height, and their sizes
increase gradually. In other words, we are able to discover the shape variations among a large
group of torsos when given a proper number of clusters (i.e., the number of shape types).
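
One way to turn a cluster center in the embedding space back into a torso surface is to pick the
real sample closest to that center and reconstruct its original scan. The report does not state
that this is how the prototypes were obtained, so the selection rule below and the name
`cluster_prototypes` are assumptions.

```python
import numpy as np

def cluster_prototypes(embedding, labels):
    """Map each cluster label to the index of the sample nearest its centroid."""
    prototypes = {}
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)          # sample indices in cluster c
        center = embedding[members].mean(axis=0)       # centroid in embedding space
        d = ((embedding[members] - center) ** 2).sum(axis=1)
        prototypes[int(c)] = int(members[d.argmin()])
    return prototypes
```

The returned indices refer to rows of the original data matrix, so the corresponding 3D scans
can be looked up directly for visualization.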


Figure 7: Visualization of clustering results on all torsos (6 clusters), shown as a 3D LE
embedding of all torsos.


Another interesting question is whether we can discover the shape variations without the gender
information. We evaluate the performance of our algorithm on all the torsos. Figure 7 shows the
clustering results of all torsos after manifold embedding when the numbers of clusters are 6 and
12. We can observe from Figure 7 that male data and female data lie in two different manifold
structures. When the number of clusters reaches 12, our algorithm can distinguish both gender and
shape types.

Tables 1-3 show the accuracy of different methods on the MSR 3D databases. The proposed method
performs better than the state-of-the-art low-rank tensor representation learning methods.
HON4D+Ddisc (Oreifej, Liu, and Redmond 2013) is the latest work on the gesture database using a
normal orientation histogram. Zhang et al.’s work (Zhang et al. 2013) proposed to rectify and align
images with distortion and partially missing content; in this experiment we used the image sequence
after its low-rank learning. It had lower accuracy than our method, as it relies on the original
images and can deal with minor changes, such as sparse noise, small fragments, and distortion, but
is not suitable for the large-scale movement, distortion or rotation in the gesture classification
task. Zhong and Cheriet’s method is less effective than ours. In the Test One and Cross Subject
sets our method performs best. In the Test Two set, our accuracy is just 2% lower than that of Chen
et al.’s method. For Zhang et al.’s (Zhang et al. 2013) work, we used all the images in the
database, i.e., 10 x 567 = 5670 images. It was able to deal with trivial sparse noise or
distortion, such as the digit ’3’ in their test experiment (Zhang et al. 2013). However, the action
videos contain large-scale movements of the arms or legs, making it unsuitable for this
application.


Table 1: Results for the MSR gesture database.

Method          Accuracy (%)
HON4D+Ddisc     92.45
HON4D           87.29
Zhang et al.    89.93
Zhong et al.    69.44
LRTD            99.09


Table 2: Results for the MSR action database.

Method          Accuracy (%)
HON4D+Ddisc     88.89
HON4D           85.85
Zhang et al.    95.96
Zhong et al.    92.88
LRTD            98.50


Table 3: Accuracy (%) of 3 sets on the MSR action database.

Setting              Subset    Chen    Zhang   Zhong   Ours
Test One             AS1       97.3    46.67   92.76   99.34
                     AS2       96.1    47.71   98.08   99.36
                     AS3       98.7    11.33   80.26   99.34
                     Average   97.4    35.24   90.37   99.35
Test Two             AS1       98.6    45.95   77.63   98.68
                     AS2       98.7    47.24   91.03   97.44
                     AS3       100     10.81   90.79   96.05
                     Average   99.1    34.67   86.48   97.39
Cross Subject Test   AS1       96.2    44.35   91.67   98.33
                     AS2       83.2    46.16   85.83   97.50
                     AS3       92.0    10.81   85.83   99.17
                     Average   90.5    33.78   87.78   98.33

3.6  Conclusion 

From  the  above  experimental  results,  we  can  conclude  that  the  proposed  clustering  algorithm  could 
automatically  discover  different  shape  types.  First,  we  show  on  both  female  and  male  torsos  that 
manifold  embedding  algorithms  can  reveal  the  underlying  structures  of  high-dimensional  torso  data. 
Second,  compared  with  the  baseline  method,  K-Means,  our  algorithm  clearly  groups  the  torsos  into 
several  disjoint  subsets.  In  addition,  if  the  gender  information  is  unknown,  our  algorithm  can  also 
distinguish  both  gender  and  shape  types  by  virtue  of  manifold  clustering.  An  open  question  is  how  to 
design  a  strategy  to  determine  the  optimal  number  of  clusters,  which  could  be  considered  in  our  future 
work.  Results  on  the  MSR  hand  gesture  3D  database  and  the  MSR  action  3D  database  have  shown  that 
our  method  performs  better  than  the  state-of-the-art  low-rank  tensor  representation  learning  methods. 

4.  Scientific  Significance 

By investigating novel low-rank matrix approximation methodologies in a new manifold learning
framework coupled with graph embedding and subspace learning, the proposed research seeks to
advance the basic understanding and visual representation of large-scale 3D scan data, and will
allow for important advances in fundamental computer vision and pattern classification research.
Low-rank graph approximation by divide-approximate-combine and graph-based locality preserving
hashing for enhancing the computability of large-scale tasks would be a significant contribution,
which will guarantee computable, stable and fast optimizations. This project will be a
collaborative research effort with the Natick Soldier Research, Development and Engineering Center
(NSRDEC), which may significantly facilitate and advance the ongoing US Army Anthropometric Survey.
Such progress will significantly advance the visual intelligence field and contribute to the
accomplishment of the Army’s mission.

5.  Future  Research  Plans 

The project starts by building databases from the collaborators at NSC; then comprehensively
investigates the proposed new methodologies; and finally validates the 3D body shape clustering
applications. It is strongly believed that the theoretical contribution of this research may lay
the foundation for novel techniques in solving important problems of visual understanding and
large-scale visual analytics. Such a research endeavor is sustainable and can go well beyond the
9-month scope, which is the PI’s long-term career goal. The leveraged ARO YIP award and DURIP award
are concrete examples of future research plans within the PI’s key research interests.